Specification Mining with Few False Positives

نویسندگان

Claire Le Goues

Westley Weimer

چکیده

Formal specifications can help with program testing, optimization, refactoring, documentation, and, most importantly, debugging and repair. Unfortunately, formal specifications are difficult to write manually and techniques that infer specifications automatically suffer from 90–99% false positive rates. Consequently, neither option is currently practiced for most software development projects. We present a novel technique that automatically infers partial correctness specifications with a very low false positive rate. We claim that existing specification miners yield false positives because they assign equal weight to all aspects of program behavior. For example, we grant less credence to duplicate code, infrequently-tested code, and code that has been changed often or recently. By using additional information from the software engineering process, we are able to dramatically reduce this rate. We evaluate our technique in two ways: as a preprocessing step for an existing specification miner and as part of a novel specification inference algorithm. Our technique identifies which traces are most indicative of program behavior, which allows off-the-shelf mining techniques to learn the same number of specifications using 60% of their original input. This results in many fewer false positives as compared to state of the art techniques, while still finding useful specifications on over 800,000 lines of

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Controlling False Positives in Association Rule Mining

Association rule mining is an important problem in the data mining area. It enumerates and tests a large number of rules on a dataset and outputs rules that satisfy user-specified constraints. Due to the large number of rules being tested, rules that do not represent real systematic effect in the data can satisfy the given constraints purely by random chance. Hence association rule mining often...

متن کامل

Reducing False Positives in the Construction of Adjective Scales

Many adjectives that appear to be synonyms of one another differ in their intensity. Distinguishing the nuances between adjective synonyms is vital to linguistic understanding of a language, but WordNet currently does not encode the relative intensities of adjective synonyms that lie on the scale. Sheinman & Tokunaga (2009) proposed a solution of constructing Adjective Scales by data mining a w...

متن کامل

Data mining and machine learning - Towards reducing false positives in intrusion detection

Intrusion Detection Systems (IDSs) are used to monitor computer systems for signs of security violations. Having detected such signs, IDSs trigger alerts to report them. These alerts are presented to a human analyst, who evaluates them and initiates an adequate response. In practice, IDSs have been observed to trigger thousands of alerts per day, most of which are mistakenly triggered by benign...

متن کامل

An Outlier Detection-Based Alert Reduction Model

Intrusion Detection Systems (IDSs) are widely deployed with increasing of unauthorized activities and attacks. However they often overload security managers by triggering thousands of alerts per day. And up to 99% of these alerts are false positives (i.e. alerts that are triggered incorrectly by benign events). This makes it extremely difficult for managers to correctly analyze security state a...

متن کامل

Attribute Weighting with Adaptive NBTree for Reducing False Positives in Intrusion Detection

In this paper, we introduce new learning algorithms for reducing false positives in intrusion detection. It is based on decision tree-based attribute weighting with adaptive naïve Bayesian tree, which not only reduce the false positives (FP) at acceptable level, but also scale up the detection rates (DR) for different types of network intrusions. Due to the tremendous growth of network-based se...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2009

Specification Mining with Few False Positives

نویسندگان

چکیده

منابع مشابه

Controlling False Positives in Association Rule Mining

Reducing False Positives in the Construction of Adjective Scales

Data mining and machine learning - Towards reducing false positives in intrusion detection

An Outlier Detection-Based Alert Reduction Model

Attribute Weighting with Adaptive NBTree for Reducing False Positives in Intrusion Detection

عنوان ژورنال:

اشتراک گذاری